Challenges in Information Extraction from Tables in Biomedical Research Publications: a Dataset Analysis
Abstract
We present a study of a dataset of tables from biomedical research publications. Our aim is to identify characteristics of biomedical tables that pose challenges for the task of extracting information from tables, and to determine which parts of research papers typically contain information that is useful for this task. Our results indicate that biomedical tables are hard to interpret without their source papers due to the brevity of the entries in the tables. In many cases, unstructured text segments, such as table titles, footnotes and non-table prose discussing a table, are required to interpret the table’s entries.
Related resources
Extracting information from textual documents in the electronic health record: a review of recent research.
OBJECTIVES We examine recent published research on the extraction of information from textual documents in the Electronic Health Record (EHR). METHODS Literature review of the research published after 1995, based on PubMed, conference proceedings, and the ACM Digital Library, as well as on relevant publications referenced in papers already included. RESULTS 174 publications were selected an...
Presenting a method for extracting structured domain-dependent information from Farsi Web pages
Extracting structured information about entities from web texts is an important task in web mining, natural language processing, and information extraction. Information extraction is useful in many applications including search engines, question-answering systems, recommender systems, machine translation, etc. An information extraction system aims to identify the entities from the text and extr...
A Bibliometric Analysis of Toxicology Publications of Iran and Turkey in ISI Web of Science
Background: Web of Science (WoS) is an online academic citation index provided by Thomson Reuters which supplies valuable bibliometric information for comparing the impact of a specific author, organization, or country in science production. The aim of this study was to compare toxicology publications of Iran and Turkey indexed in WoS from a bibliometric point of view. Methods: The WoS database was ...
A PCA/ICA based Fetal ECG Extraction from Mother Abdominal Recordings by Means of a Novel Data-driven Approach to Fetal ECG Quality Assessment
Background: Fetal electrocardiography is a developing field that provides valuable information on fetal health during pregnancy. Early diagnosis and treatment of fetal heart problems give the infant a better chance of survival. Objective: Here, we extract fetal ECG from maternal abdominal recordings and detect R-peaks in order to recognize fetal heart rate. On the next step, we find a be...
Analysis of Scientific Publications in the Field of Ethics in Accounting
Background: Scientific articles represent the efforts of researchers, are a useful and valuable source of information, and can be taken as a basis for scientific and performance analysis. The purpose of this research is to study the scientific production of the subject area of ethics in accounting. Method: This descriptive-analytical research examined 145 articles of the subject area of ethics ...
Publication date: 2014